OpenChat: Advancing Open-source Language Models with Mixed-Quality Data
Nowadays, open-source large language models like LLaMA have emerged. Recent
developments have incorporated supervised fine-tuning (SFT) and reinforcement
learning fine-tuning (RLFT) to align these models with human goals. However,
SFT methods treat all training data with mixed quality equally, while RLFT
methods require high-quality pairwise or ranking-based preference data. In this
study, we present a novel framework, named OpenChat, to advance open-source
language models with mixed-quality data. Specifically, we consider the general
SFT training data, consisting of a small amount of expert data mixed with a
large proportion of sub-optimal data, without any preference labels. We propose
the C(onditioned)-RLFT, which regards different data sources as coarse-grained
reward labels and learns a class-conditioned policy to leverage complementary
data quality information. Interestingly, the optimal policy in C-RLFT can be
easily solved through single-stage, RL-free supervised learning, which is
lightweight and avoids costly human preference labeling. Through extensive
experiments on three standard benchmarks, our openchat-13b fine-tuned with
C-RLFT achieves the highest average performance among all 13b open-source
language models. Moreover, we use AGIEval to validate the model generalization
performance, in which only openchat-13b surpasses the base model. Finally, we
conduct a series of analyses to shed light on the effectiveness and robustness
of OpenChat. Our code, data, and models are publicly available at
https://github.com/imoneoi/openchat
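
The idea of treating data sources as coarse-grained reward labels and training a class-conditioned policy with plain supervised learning can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the source names, the reward values, the tag format in `condition_prompt`, and the `exp(reward / beta)` weighting are all assumptions made for this sketch and are not stated in the abstract.

```python
import math

# Hypothetical coarse-grained rewards, one per data source (illustrative values).
SOURCE_REWARD = {"expert": 1.0, "suboptimal": 0.1}


def condition_prompt(prompt: str, source: str) -> str:
    """Prefix the prompt with a tag naming its data source (the class condition)."""
    return f"<{source}> {prompt}"


def weighted_nll(examples, beta: float = 1.0) -> float:
    """Reward-weighted negative log-likelihood over a batch.

    Each example is (source, token_log_probs), where token_log_probs are the
    model's log-probabilities of the response tokens given the conditioned
    prompt. The exp(reward / beta) weight is an assumption of this sketch.
    """
    total, weight_sum = 0.0, 0.0
    for source, token_log_probs in examples:
        weight = math.exp(SOURCE_REWARD[source] / beta)
        total += weight * -sum(token_log_probs)
        weight_sum += weight
    return total / weight_sum


# Usage: expert responses contribute more to the loss than sub-optimal ones.
batch = [("expert", [-0.1, -0.2]), ("suboptimal", [-1.0, -2.0])]
loss = weighted_nll(batch)
```

Because the loss is an ordinary weighted likelihood, it needs no reward model or preference pairs, which is the sense in which the abstract calls the approach RL-free.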
Porous single crystalline-like titanium dioxide monolith with enhanced photoelectrochemical performance
Macro-sized porous single crystalline-like (PSC-like) TiO2 offers unique structural advantages: its structural consistency and porosity over a large area can significantly enhance its photoelectrochemical function. However, growing porous single crystalline-like monoliths poses significant technical challenges. Structural consistency requires grain boundaries to be reduced to a minimum, which conflicts with a three-dimensional percolation structure. Here we report a lattice reconstruction strategy based on solid-solid transformation to grow porous single crystal-like anatase TiO2 dominated by (200) and (101) facets at the 2 cm scale. Unlike a porous single crystal in the traditional sense, it has two different lattice orientations but still shows good photoelectrochemical properties. Band gap engineering introduces a Ti3+ gap into the lattice to generate Magnéli-phase TinO2n−1, limiting the created active structure to the lattice with a two-dimensional surface, which opens a new avenue for creating highly active surfaces that capture photons and transport electrons stably. As a photocatalyst, the PSC-like TinO2n−1 provides an enhanced exciton lifetime (3–5 ns) and shows significant visible light absorption. The independent PSC-like TinO2n−1 delivers a high photocurrent of 1.8–5.5 mA·cm−2 at room temperature with no decay over 10 h.
A genetic variation map for chicken with 2.8 million single-nucleotide polymorphisms
We describe a genetic variation map for the chicken genome containing 2.8 million single-nucleotide polymorphisms (SNPs). This map is based on a comparison of the sequences of three domestic chicken breeds (a broiler, a layer and a Chinese silkie) with that of their wild ancestor, red jungle fowl. Subsequent experiments indicate that at least 90% of the variant sites are true SNPs, and at least 70% are common SNPs that segregate in many domestic breeds. Mean nucleotide diversity is about five SNPs per kilobase for almost every possible comparison: between red jungle fowl and domestic lines, between two different domestic lines, and within domestic lines. This contrasts with the notion that domestic animals are highly inbred relative to their wild ancestors. In fact, most of the SNPs originated before domestication, and there is little evidence of selective sweeps for adaptive alleles on length scales greater than 100 kilobases.
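
To make the unit "SNPs per kilobase" concrete, the following sketch counts mismatches between two already-aligned sequences and scales the rate to 1,000 bases. It only illustrates the arithmetic: the function name and the toy sequences are hypothetical, and the study's own estimates come from sequencing reads, not from a naive pairwise count like this one.

```python
def snps_per_kilobase(seq_a: str, seq_b: str) -> float:
    """Mismatches per 1,000 compared bases between two aligned sequences.

    Positions where either sequence has a non-ACGT symbol (gap, N) are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 1000.0 * mismatches / compared


# Toy example: one difference in 200 aligned bases is 5 SNPs per kilobase.
reference = "ACGT" * 50
variant = "T" + reference[1:]
rate = snps_per_kilobase(reference, variant)
```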